* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Using transformation: log
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
45 positively skewed variables were transformed:
carepf1
carepf2
carepf3
carepf4
carepf5
carepf6
carepf8
carepf9
carepf10
carepf11
carepf12
carepf13
carepf14
carepf15
carepf16
carepf17
carepf18
carepf19
carepf20
carepf21
carepf22
carepf23
carepf24
carepf25
carepf26
carepf27
carepf28
carepf29
carepf30
duq3
duq6
duq9
duq10
duq17
duq20
duq21
duq22
duq23
duq24
duq25
duq29
duq30
duq31
duq32
duq33
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Dropping 30 positively skewed variables that could not be transformed successfully:
carepf1
carepf3
carepf4
carepf10
carepf11
carepf12
carepf14
carepf16
carepf19
carepf21
carepf22
carepf23
carepf24
carepf25
carepf27
carepf29
duq3
duq6
duq9
duq17
duq21
duq22
duq23
duq24
duq25
duq29
duq30
duq31
duq32
duq33
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
No negatively skewed variables found.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
This notebook contains exploratory analyses of behavioral data collected to investigate the relationship between risk-taking behavior and probabilistic learning.
The sample consists of three age groups: kids, teens and adults. We hypothesize that the ability to learn from high-variance feedback improves with age, and that this is related to better risky decisions.
Subjects completed a probabilistic learning task in the scanner, a risky decision-making task (BART) outside the scanner and numerous questionnaires. The focus of this notebook is on the first task.
The plan of analysis is to establish that adults are more sensitive to high-variance feedback in the probabilistic learning task and to relate this (modeled) sensitivity to behavior both in BART and in other self-reported risky behaviors. Details of correlations are found here.
First let’s get a sense of the sample. Here is how many subjects have complete datasets for the probabilistic learning task, along with their age breakdowns.
machine_game_data_clean %>%
  group_by(age_group) %>%
  summarise(min_age = min(calc_age),
            mean_age = mean(calc_age),
            sd_age = sd(calc_age),
            max_age = max(calc_age),
            n = ceiling(n()/180)) # trial-level data: each subject contributes 180 rows
This task is a modified Iowa Gambling Task. Subjects are presented with a fractal in each trial. The fractals represent different machines (single-armed bandits). Subjects choose to play or pass in each trial. Each machine yields a probabilistic reward. There are four machines in total. Two with positive and two with negative expected value. One of each of these machines has a low variance reward schedule while the other has a high variance reward schedule.
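The variance manipulation can be made concrete with a toy calculation. The payoff probabilities below are illustrative assumptions only (the facet labels appear to list the two possible outcomes per machine, but the schedules are not given here); they are chosen so the two positive EV machines share the same expected value:

```r
# Illustrative only: outcome probabilities are assumed, not the task's actual schedules.
ev_var <- function(outcomes, probs){
  ev <- sum(probs * outcomes)              # expected value
  v  <- sum(probs * (outcomes - ev)^2)     # outcome variance
  c(ev = ev, var = v)
}
ev_var(c(-10, 100), c(0.5, 0.5)) # EV 45, variance 3025  (low variance, positive EV)
ev_var(c(-5, 495),  c(0.9, 0.1)) # EV 45, variance 22500 (high variance, positive EV)
```

Under these assumed probabilities both machines have the same EV, but the high-variance machine's outcome variance is roughly seven times larger.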
Performance in this task can be assessed by looking at the total number of points subjects have earned by the end of the task. The following graph shows that adults collect more points in this task than kids.
machine_game_data_clean %>%
  group_by(Sub_id) %>%
  summarise(total_points = sum(Points_earned)) %>%
  do(assign.age.info(.)) %>%
  group_by(age_group) %>%
  summarise(mean_points = mean(total_points),
            sem_points = sem(total_points)) %>%
  ggplot(aes(age_group, mean_points))+
  geom_bar(stat='identity', position = position_dodge(0.9))+
  geom_errorbar(aes(ymin=mean_points-sem_points, ymax=mean_points+sem_points), position = position_dodge(0.9), width=0.25)+
  theme_bw()+
  xlab('Age group')+
  ylab('Mean points')
This difference is statistically significant: adults earn more points compared to the kids.
tmp = machine_game_data_clean %>%
  group_by(Sub_id) %>%
  summarise(total_points = sum(Points_earned)) %>%
  do(assign.age.info(.))
summary(lm(total_points~age_group, data=tmp))
Call:
lm(formula = total_points ~ age_group, data = tmp)
Residuals:
Min 1Q Median 3Q Max
-2591.0 -946.9 -47.5 1108.7 2551.2
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 226.2 245.4 0.922 0.359806
age_groupteen 442.6 360.7 1.227 0.223858
age_groupadult 1379.8 384.1 3.592 0.000601 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1322 on 71 degrees of freedom
Multiple R-squared: 0.1552, Adjusted R-squared: 0.1314
F-statistic: 6.519 on 2 and 71 DF, p-value: 0.002515
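Because age_group is dummy-coded with kids as the reference level, the group means can be read off the coefficients above:

```r
# Recover group means from the dummy-coded coefficients in the output above.
kid_mean   <- 226.2           # (Intercept): reference level (kids)
teen_mean  <- 226.2 + 442.6   # + age_groupteen
adult_mean <- 226.2 + 1379.8  # + age_groupadult
c(kid = kid_mean, teen = teen_mean, adult = adult_mean)
# kid = 226.2, teen = 668.8, adult = 1606.0
```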
Since we are interested in the age differences between sensitivity to different feedback schedules, we should show that this difference in performance exists especially for the high variance feedback condition(s). Here is the plot of performance (total points earned) broken down by conditions.
machine_game_data_clean %>%
  group_by(Sub_id, facet_labels) %>%
  summarise(total_points = sum(Points_earned)) %>%
  do(assign.age.info(.)) %>%
  group_by(age_group, facet_labels) %>%
  summarise(mean_points = mean(total_points),
            sem_points = sem(total_points)) %>%
  ggplot(aes(facet_labels, mean_points, fill=age_group))+
  geom_bar(stat='identity', position = position_dodge(0.9))+
  geom_errorbar(aes(ymin=mean_points-sem_points, ymax=mean_points+sem_points), position = position_dodge(0.9), width=0.25)+
  # theme_bw()+
  xlab('Machine')+
  ylab('Mean points')+
  labs(fill='Age group')
ggsave("Points_earned.jpeg", device = "jpeg", path = fig_path, width = 7, height = 5, units = "in", dpi = 450)
Running separate models for positive and negative EV machines for ease of interpretation.
tmp <- machine_game_data_clean %>%
  group_by(Sub_id, facet_labels) %>%
  summarise(total_points = sum(Points_earned)) %>%
  do(assign.age.info(.))
In the positive EV machines there is a main effect for the high variance machine: subjects earn fewer points in the high variance condition compared to the low variance condition. The age-by-variance interactions are not significant.
summary(lm(total_points ~ age_group*facet_labels, data = tmp %>% filter(facet_labels %in% c("-10,+100", "-5,+495"))))
Call:
lm(formula = total_points ~ age_group * facet_labels, data = tmp %>%
filter(facet_labels %in% c("-10,+100", "-5,+495")))
Residuals:
Min 1Q Median 3Q Max
-1477.0 -297.2 144.8 329.5 813.0
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1454.483 89.652 16.224 <2e-16
age_groupteen 191.517 131.761 1.454 0.1483
age_groupadult 290.017 140.327 2.067 0.0406
facet_labels-5,+495 -289.655 126.787 -2.285 0.0238
age_groupteen:facet_labels-5,+495 -151.945 186.338 -0.815 0.4162
age_groupadult:facet_labels-5,+495 7.155 198.453 0.036 0.9713
(Intercept) ***
age_groupteen
age_groupadult *
facet_labels-5,+495 *
age_groupteen:facet_labels-5,+495
age_groupadult:facet_labels-5,+495
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 482.8 on 142 degrees of freedom
Multiple R-squared: 0.164, Adjusted R-squared: 0.1346
F-statistic: 5.572 on 5 and 142 DF, p-value: 0.000102
In the negative EV machines there is again a main effect for the high variance machine: everyone loses fewer points in the low variance condition. There is also a main effect for adults: adults perform better than kids for both negative EV machines.
summary(lm(total_points ~ age_group*facet_labels, data = tmp %>% filter(facet_labels %in% c("+10,-100", "+5,-495"))))
Call:
lm(formula = total_points ~ age_group * facet_labels, data = tmp %>%
filter(facet_labels %in% c("+10,-100", "+5,-495")))
Residuals:
Min 1Q Median 3Q Max
-1290.00 -373.45 1.72 402.95 1017.07
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -951.03 89.33 -10.646 < 2e-16
age_groupteen 78.63 131.29 0.599 0.550156
age_groupadult 355.53 139.82 2.543 0.012069
facet_labels+5,-495 -491.03 126.33 -3.887 0.000155
age_groupteen:facet_labels+5,-495 54.23 185.67 0.292 0.770631
age_groupadult:facet_labels+5,-495 81.53 197.74 0.412 0.680714
(Intercept) ***
age_groupteen
age_groupadult *
facet_labels+5,-495 ***
age_groupteen:facet_labels+5,-495
age_groupadult:facet_labels+5,-495
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 481 on 142 degrees of freedom
Multiple R-squared: 0.2572, Adjusted R-squared: 0.2311
F-statistic: 9.835 on 5 and 142 DF, p-value: 4.331e-08
So the age difference in performance is driven by differences in performance on the negative EV machines. The question is: what difference in behavior in these conditions leads to this difference in performance?
To anticipate possible cognitive processes that will be parameterized in RL models, differences can lie in how quickly the groups learn the probabilities, how much weight they put on the outcomes, and/or how much like an optimal agent they behave.
The first thing we can look at is how often subjects play versus pass. It’s hard to see any age differences when we just look at frequency of overall playing as below.
machine_game_data_clean %>%
  group_by(Sub_id, Response) %>%
  tally %>%
  group_by(Sub_id) %>%
  mutate(pct=(100*n)/sum(n)) %>%
  do(assign.age.info(.)) %>%
  group_by(age_group, Response) %>%
  dplyr::summarise(mean_pct = mean(pct),
                   sem_pct = sem(pct)) %>%
  ggplot(aes(Response, mean_pct, fill = age_group))+
  geom_bar(stat='identity', position = position_dodge(0.9))+
  geom_errorbar(aes(ymin = mean_pct - sem_pct, ymax = mean_pct + sem_pct), position = position_dodge(width = 0.9), width=0.25)+
  theme_bw()+
  ylab('Percentage of trials')+
  labs(fill = 'Age group')
It is also not immediately apparent how to translate this to better performance/learning in this task, but one way to think about it: if people learned perfectly they should play half of the time (always for the positive expected value machines and never for the negative expected value machines). The fact that all play proportions are above 50% suggests that nobody learns perfectly and that adults might be closest to it. But this is very crude; a better way to look at it is to break playing down by machine.
To get a better sense of overall behavior in different contingency states we break this proportion of playing down by machines.
Now we can see age differences in playing frequency in different conditions, particularly in the negative expected value machines (bottom row).
machine_game_data_clean %>%
  group_by(Sub_id, facet_labels, Response) %>%
  tally %>%
  group_by(Sub_id, facet_labels) %>%
  mutate(pct=(100*n)/sum(n)) %>%
  do(assign.age.info(.)) %>%
  group_by(age_group, facet_labels, Response) %>%
  summarise(mean_pct = mean(pct),
            sem_pct = sem(pct)) %>%
  ggplot(aes(Response, mean_pct, fill = age_group))+
  geom_bar(stat='identity', position = position_dodge(0.9))+
  geom_errorbar(aes(ymin = mean_pct - sem_pct, ymax = mean_pct + sem_pct), position = position_dodge(width = 0.9), width=0.25)+
  ylab('Percentage of trials')+
  facet_wrap(~facet_labels)+
  labs(fill = 'Age group')
ggsave("Prop_played.jpeg", device = "jpeg", path = fig_path, width = 8, height = 5, units = "in", dpi = 450)
The differences in points earned map directly onto the proportion of choosing to play each machine:
tmp <- machine_game_data_clean %>%
  group_by(Sub_id, facet_labels, Response) %>%
  tally %>%
  group_by(Sub_id, facet_labels) %>%
  mutate(pct_play=(100*n)/sum(n)) %>%
  filter(Response == 'play') %>%
  do(assign.age.info(.))
summary(lmer(pct_play ~ age_group*facet_labels + (1|Sub_id), data = tmp))
Linear mixed model fit by REML ['lmerMod']
Formula: pct_play ~ age_group * facet_labels + (1 | Sub_id)
Data: tmp
REML criterion at convergence: 2602.4
Scaled residuals:
Min 1Q Median 3Q Max
-2.66682 -0.69780 -0.00607 0.71675 1.97256
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 61.23 7.825
Residual 436.76 20.899
Number of obs: 296, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error t value
(Intercept) 74.187 4.144 17.903
age_groupteen 7.857 6.090 1.290
age_groupadult 12.368 6.486 1.907
facet_labels-5,+495 -16.853 5.488 -3.071
facet_labels+10,-100 -28.214 5.488 -5.141
facet_labels+5,-495 -10.008 5.488 -1.824
age_groupteen:facet_labels-5,+495 -4.924 8.066 -0.611
age_groupadult:facet_labels-5,+495 2.742 8.591 0.319
age_groupteen:facet_labels+10,-100 -12.052 8.066 -1.494
age_groupadult:facet_labels+10,-100 -29.341 8.591 -3.416
age_groupteen:facet_labels+5,-495 -13.903 8.066 -1.724
age_groupadult:facet_labels+5,-495 -34.325 8.591 -3.996
Correlation of Fixed Effects:
(Intr) ag_grpt ag_grpd f_-5,+ f_+10, f_+5,-
age_grouptn -0.680
age_gropdlt -0.639 0.435
fct_-5,+495 -0.662 0.451 0.423
fc_+10,-100 -0.662 0.451 0.423 0.500
fct_+5,-495 -0.662 0.451 0.423 0.500 0.500
ag_grpt:_-5,+495 0.451 -0.662 -0.288 -0.680 -0.340 -0.340
ag_grpd:_-5,+495 0.423 -0.288 -0.662 -0.639 -0.319 -0.319
ag_grpt:_+10,-100 0.451 -0.662 -0.288 -0.340 -0.680 -0.340
ag_grpd:_+10,-100 0.423 -0.288 -0.662 -0.319 -0.639 -0.319
ag_grpt:_+5,-495 0.451 -0.662 -0.288 -0.340 -0.340 -0.680
ag_grpd:_+5,-495 0.423 -0.288 -0.662 -0.319 -0.319 -0.639
ag_grpt:_-5,+495 ag_grpd:_-5,+495 ag_grpt:_+10,-100
age_grouptn
age_gropdlt
fct_-5,+495
fc_+10,-100
fct_+5,-495
ag_grpt:_-5,+495
ag_grpd:_-5,+495 0.435
ag_grpt:_+10,-100 0.500 0.217
ag_grpd:_+10,-100 0.217 0.500 0.435
ag_grpt:_+5,-495 0.500 0.217 0.500
ag_grpd:_+5,-495 0.217 0.500 0.217
ag_grpd:_+10,-100 ag_grpt:_+5,-495
age_grouptn
age_gropdlt
fct_-5,+495
fc_+10,-100
fct_+5,-495
ag_grpt:_-5,+495
ag_grpd:_-5,+495
ag_grpt:_+10,-100
ag_grpd:_+10,-100
ag_grpt:_+5,-495 0.217
ag_grpd:_+5,-495 0.500 0.435
This is not surprising given what the number of points earned already showed. But now that we are looking at a behavioral measure instead of an outcome measure we might be able to quantify constructs of interest like sensitivity to variance or sensitivity to the expected values of the machines.
As a first step to translate raw playing behavior into learning, I recoded the choices as correct when a subject chooses to play a positive expected value machine or pass a negative expected value machine, and incorrect when the reverse is true. If a subject is learning, they should be learning to play the positive expected value machines and to pass the others.
Recoding the behavior in this way gave a clearer picture of the age difference in learning of optimal behavior between the conditions. Specifically we can now look at how the probability of a correct choice changes for each age group in each condition across trials.
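The recoding rule can be sketched as follows (the function and input values here are illustrative, not the notebook's actual variables):

```r
# Illustrative sketch of the recoding rule:
# 'correct' = play a positive EV machine, or pass a negative EV machine.
recode_correct <- function(response, ev_positive){
  ifelse(ev_positive, response == "play", response == "pass") * 1
}
recode_correct(c("play", "pass", "play", "pass"),
               c(TRUE,   TRUE,   FALSE,  FALSE)) # 1 0 0 1
```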
machine_game_data_clean %>%
  group_by(Sub_id, facet_labels) %>%
  mutate(rel_tm = 1:n()) %>%
  # ggplot(aes(scale(Trial_number), correct1_incorrect0))+
  ggplot(aes(rel_tm, correct1_incorrect0))+
  geom_line(aes(group = Sub_id, col = factor(age_group, levels=c('kid', 'teen', 'adult'))), stat='smooth', method = 'glm', method.args = list(family = "binomial"), se = FALSE, alpha=0.2)+
  geom_line(aes(col = factor(age_group, levels=c('kid', 'teen', 'adult'))), stat='smooth', method = 'glm', method.args = list(family = "binomial"), se = FALSE, alpha=1, size=2)+
  facet_wrap(~facet_labels)+
  theme_bw()+
  # xlab("Relative trial number")+
  xlab("Trial number")+
  scale_y_continuous(breaks=c(0,1))+
  labs(col="Age group")+
  ylab('Correct choice')+
  theme(legend.position = "bottom",
        panel.grid = element_blank())
ggsave("Learning.jpeg", device = "jpeg", path = fig_path, width = 8, height = 5, units = "in", dpi = 450)
Effect of EV: comparing positive EV to negative EV (the two rows), there is no real learning (no significant change in behavior across time) for the positive EV machines, while there is for the negative EV machines.
Effect of variance: comparing high variance to low variance (the two columns), there is an interaction: there is no effect of variance for the positive EV machines, but there is for the negative EV machines, such that learning from high variance is harder for kids.
So the lower the EV, the more learning on average (for all age groups), unless the outcomes are too variable, in which case kids don't learn from the negative EV machine either.
Looking at learning effects separately for each machine to avoid interpreting messy three-way interactions.
Adults are more likely to make correct decisions in the low var positive EV machine.
summary(glmer(correct1_incorrect0 ~ age_group*scale(Trial_number)+(1|Sub_id), data = machine_game_data_clean %>% filter(facet_labels %in% c('-10,+100')), family=binomial))
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: correct1_incorrect0 ~ age_group * scale(Trial_number) + (1 |
Sub_id)
Data:
machine_game_data_clean %>% filter(facet_labels %in% c("-10,+100"))
AIC BIC logLik deviance df.resid
2749.0 2791.7 -1367.5 2735.0 3313
Scaled residuals:
Min 1Q Median 3Q Max
-5.9273 0.1141 0.2178 0.4711 1.9148
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 2.424 1.557
Number of obs: 3320, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.43430 0.30300 4.734 2.2e-06
age_groupteen 0.66864 0.44990 1.486 0.13723
age_groupadult 1.51209 0.49771 3.038 0.00238
scale(Trial_number) 0.03214 0.06979 0.460 0.64518
age_groupteen:scale(Trial_number) 0.08535 0.11070 0.771 0.44069
age_groupadult:scale(Trial_number) -0.03529 0.13146 -0.268 0.78835
(Intercept) ***
age_groupteen
age_groupadult **
scale(Trial_number)
age_groupteen:scale(Trial_number)
age_groupadult:scale(Trial_number)
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) ag_grpt ag_grpd sc(T_) ag_grpt:(T_)
age_grouptn -0.667
age_gropdlt -0.597 0.411
scl(Trl_nm) 0.005 -0.004 -0.003
ag_grpt:(T_) -0.003 0.012 0.003 -0.631
ag_grpd:(T_) -0.003 0.002 0.001 -0.531 0.335
The probability of making a correct response for the high var positive EV machine doesn’t change for kids or teens but increases for adults across trials.
summary(glmer(correct1_incorrect0 ~ age_group*scale(Trial_number)+(1|Sub_id), data = machine_game_data_clean %>% filter(facet_labels %in% c('-5,+495')), family=binomial))
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: correct1_incorrect0 ~ age_group * scale(Trial_number) + (1 |
Sub_id)
Data:
machine_game_data_clean %>% filter(facet_labels %in% c("-5,+495"))
AIC BIC logLik deviance df.resid
3523.1 3565.8 -1754.5 3509.1 3312
Scaled residuals:
Min 1Q Median 3Q Max
-4.3340 -0.7119 0.2991 0.5720 3.9680
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 2.152 1.467
Number of obs: 3319, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.430022 0.281526 1.527 0.1266
age_groupteen 0.170332 0.414035 0.411 0.6808
age_groupadult 1.017752 0.447168 2.276 0.0228 *
scale(Trial_number) 0.025711 0.064116 0.401 0.6884
age_groupteen:scale(Trial_number) 0.009261 0.096280 0.096 0.9234
age_groupadult:scale(Trial_number) 0.286436 0.111048 2.579 0.0099 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) ag_grpt ag_grpd sc(T_) ag_grpt:(T_)
age_grouptn -0.680
age_gropdlt -0.629 0.428
scl(Trl_nm) 0.003 -0.002 -0.002
ag_grpt:(T_) -0.002 0.002 0.001 -0.666
ag_grpd:(T_) -0.001 0.001 0.023 -0.577 0.385
All groups show improvement across trials for the low var negative EV machine but adults learn faster than kids and teens.
summary(glmer(correct1_incorrect0 ~ age_group*scale(Trial_number)+(1|Sub_id), data = machine_game_data_clean %>% filter(facet_labels %in% c('+10,-100')), family=binomial))
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: correct1_incorrect0 ~ age_group * scale(Trial_number) + (1 |
Sub_id)
Data:
machine_game_data_clean %>% filter(facet_labels %in% c("+10,-100"))
AIC BIC logLik deviance df.resid
3941.9 3984.6 -1963.9 3927.9 3316
Scaled residuals:
Min 1Q Median 3Q Max
-4.4795 -0.8657 0.3608 0.7405 3.7111
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 1.039 1.019
Number of obs: 3323, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.11697 0.20051 -0.583 0.559665
age_groupteen 0.46945 0.29375 1.598 0.110018
age_groupadult 1.19288 0.31646 3.769 0.000164
scale(Trial_number) 0.28413 0.06329 4.489 7.14e-06
age_groupteen:scale(Trial_number) 0.07164 0.09125 0.785 0.432409
age_groupadult:scale(Trial_number) 0.41654 0.10770 3.868 0.000110
(Intercept)
age_groupteen
age_groupadult ***
scale(Trial_number) ***
age_groupteen:scale(Trial_number)
age_groupadult:scale(Trial_number) ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) ag_grpt ag_grpd sc(T_) ag_grpt:(T_)
age_grouptn -0.683
age_gropdlt -0.635 0.434
scl(Trl_nm) 0.000 0.001 0.001
ag_grpt:(T_) 0.000 0.007 0.000 -0.693
ag_grpd:(T_) -0.002 0.002 0.054 -0.586 0.407
Kids don’t show learning across trials for the high var negative EV machine but adults and teens do.
summary(glmer(correct1_incorrect0 ~ age_group*scale(Trial_number)+(1|Sub_id), data = machine_game_data_clean%>% filter(facet_labels %in% c('+5,-495')), family=binomial))
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: correct1_incorrect0 ~ age_group * scale(Trial_number) + (1 |
Sub_id)
Data:
machine_game_data_clean %>% filter(facet_labels %in% c("+5,-495"))
AIC BIC logLik deviance df.resid
3769.2 3812.0 -1877.6 3755.2 3321
Scaled residuals:
Min 1Q Median 3Q Max
-2.9608 -0.6785 -0.3732 0.7556 3.7464
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 1.246 1.116
Number of obs: 3328, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.96663 0.21941 -4.406 1.05e-05
age_groupteen 0.48330 0.32039 1.508 0.131
age_groupadult 1.34244 0.34305 3.913 9.11e-05
scale(Trial_number) 0.03858 0.06522 0.591 0.554
age_groupteen:scale(Trial_number) 0.36695 0.09365 3.918 8.92e-05
age_groupadult:scale(Trial_number) 0.88273 0.11498 7.677 1.63e-14
(Intercept) ***
age_groupteen
age_groupadult ***
scale(Trial_number)
age_groupteen:scale(Trial_number) ***
age_groupadult:scale(Trial_number) ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) ag_grpt ag_grpd sc(T_) ag_grpt:(T_)
age_grouptn -0.684
age_gropdlt -0.640 0.438
scl(Trl_nm) -0.003 0.002 0.002
ag_grpt:(T_) 0.001 -0.011 -0.001 -0.696
ag_grpd:(T_) -0.003 0.001 0.031 -0.567 0.397
I tried to capture these effects in ‘individual difference’ variables by running the logistic regression separately for each subject in each condition. This wouldn’t capture anything different than the above analyses but I wanted to see if there were any subject-specific indices that could be correlated with other measures. I looked at three parameters: the intercept (b0), the slope (b1) and a learning index (-b0/b1).
Because each model is run on only 45 trials, the fits aren’t great and the parameter distributions have large variances.
get_learning_coef <- function(data){
  # Per-subject, per-machine logistic regression of accuracy on trial number
  model = glm(correct1_incorrect0 ~ scale(Trial_number), family = binomial(link=logit), data = data)
  b0 = coef(model)[1]   # intercept: overall accuracy level
  b1 = coef(model)[2]   # slope: rate of change in accuracy across trials
  learnIndex = -b0/b1   # (scaled) trial at which p(correct) crosses 0.5
  return(data.frame(b0, b1, learnIndex))
}
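To make learnIndex concrete: it is the point on the (scaled) trial axis where the fitted logistic curve crosses 50% correct. A quick check with illustrative coefficients:

```r
# Illustrative coefficients (not fitted values from the data):
b0 <- -1.2; b1 <- 0.8
learnIndex <- -b0/b1          # = 1.5
plogis(b0 + b1 * learnIndex)  # ~0.5 by construction
```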
tmp = machine_game_data_clean %>%
  group_by(Sub_id, facet_labels) %>%
  do(get_learning_coef(.)) %>%
  do(assign.age.info(.))
(Error bars not shown because they are very large due to bad fits).
As expected the difference between kids and adults in slopes for the high variance negative EV machine is visible here too.
tmp %>%
  ungroup() %>%
  select(facet_labels, age_group, b0, b1, learnIndex) %>%
  gather(key, value, -facet_labels, -age_group) %>%
  group_by(age_group, facet_labels, key) %>%
  summarise(mv = median(value),
            sv = sem(value)) %>%
  ggplot(aes(facet_labels, mv, fill=age_group))+
  geom_bar(stat="identity", position = position_dodge())+
  # geom_errorbar(aes(ymin = mv-sv, ymax = mv+sv), position = position_dodge(width = 0.9), width=0)+
  facet_wrap(~key, scale="free")+
  theme(legend.position = "bottom",
        legend.title = element_blank())+
  xlab("")+
  ylab("Median value")
But it’s not a good idea to look for group differences in these parameters as they are highly variable due to bad fits from few trials.
Does it make sense to look at these separately?
Since the machines differ in the variance of the outcomes and expected values it might seem sensible to look at which of these attributes has a larger effect on performance.
It’s tempting to tease apart the relative importance of these attributes for the high variance negative EV machine where we observe the performance difference between age groups.
BUT these attributes are correlated. So we can’t look at their effects separately in the same model.
# Function to calculate observed variance and observed expected value based on outcomes in trials that the subject has played.
get_obs_var_ev <- function(data){
  new_data = data
  new_data$obs_var <- NA
  new_data$obs_ev <- NA
  for(i in 1:nrow(new_data)){
    if(i == 1){
      obs_ev = 0
      obs_var = 0
    }
    else{
      # all trials up to and including the current trial
      obs = new_data[1:i,]
      # keep only played trials; beliefs should not be updated based on trials they haven't played
      obs = obs %>% filter(Response == "play") %>% ungroup() %>% pull(Points_earned)
      obs_var = var(obs) # NA until at least two played trials
      obs_probs = as.numeric(prop.table(table(obs)))
      obs_rewards = as.numeric(names(table(obs)))
      obs_ev = sum(obs_probs*obs_rewards)
    }
    new_data$obs_var[i] = obs_var
    new_data$obs_ev[i] = obs_ev
  }
  new_data$obs_var = ifelse(is.na(new_data$obs_var), 0, new_data$obs_var)
  return(new_data)
}
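Note that the frequency-weighted expected value computed inside get_obs_var_ev reduces to the running mean of the outcomes on played trials. A standalone check with a hypothetical outcome sequence:

```r
# Hypothetical outcomes of a subject's played trials:
obs <- c(100, -10, 100, -10, -10)
obs_probs   <- as.numeric(prop.table(table(obs)))  # observed outcome frequencies
obs_rewards <- as.numeric(names(table(obs)))       # distinct outcome values
isTRUE(all.equal(sum(obs_probs * obs_rewards), mean(obs)))  # TRUE
```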
tmp = machine_game_data_clean %>%
  group_by(Sub_id, facet_labels) %>%
  do(get_obs_var_ev(.))
tmp %>%
  ggplot(aes(obs_var, obs_ev))+
  geom_point()+
  facet_wrap(~facet_labels, scales="free")+
  xlab("Observed variance")+
  ylab("Observed EV")
What we are interested in is the effect of beliefs about the machines on behavior. These beliefs can be summarized quantitatively in an ‘expected value.’
The cognitive processes that can differ with respect to this expected value are how quickly it approaches the true expected value of a machine (the rate at which one incorporates each new data point into existing beliefs) and how truthfully the expected values are evaluated (is the utility of the expected value the same as its objective value?).
These two processes can be captured as the learning rate and the exponent on the prediction error in an RL model.
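A minimal sketch of what these two parameters could look like in a delta-rule update (the functional form and names here are illustrative assumptions, not the fitted model):

```r
# Illustrative delta-rule update with two candidate parameters:
#   alpha - learning rate: how fast the estimate moves toward new data
#   rho   - exponent on the prediction error: distorts outcome sensitivity
rl_update <- function(ev, outcome, alpha, rho){
  pe <- outcome - ev                   # prediction error
  ev + alpha * sign(pe) * abs(pe)^rho  # signed, power-transformed update
}
ev <- 0
for(outcome in c(100, -10, 100, -10)){
  ev <- rl_update(ev, outcome, alpha = 0.3, rho = 1)
}
ev # with rho = 1 this reduces to a standard delta rule
```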
Before moving on to modeling results here I plot the effect of observed EV (not model based) on choice to confirm that it makes sense and captures the behavioral effect:
The higher the EV of a machine, the more likely it is to be played. This is the correct action for the positive EV machines but the incorrect action for the negative EV machines. The behavioral effect in the high var negative EV machine is captured again by the diverging lines for the age groups at low EVs.
tmp %>%
  ggplot(aes(obs_ev, correct1_incorrect0))+
  geom_line(aes(group = Sub_id, col = age_group), stat='smooth', method = 'glm', method.args = list(family = "binomial"), se = FALSE, alpha=0.2)+
  geom_line(aes(col = age_group), stat='smooth', method = 'glm', method.args = list(family = "binomial"), se = FALSE, alpha=1, size=2)+
  facet_wrap(~facet_labels, scales='free')+
  xlab("EV of played trials")+
  scale_y_continuous(breaks=c(0,1))+
  labs(col="Age group")+
  ylab('Correct')+
  theme(legend.position = "bottom",
        legend.title = element_blank())
Though I focus on learning behavior, and specifically the difference in learning for the high variance negative EV machine, there are other possible behavioral patterns that might also differ between the age groups. Here I list some examples.
Do people ‘explore’ the first 10 trials where the reward probabilities for each machine are presented?
They explore less when they encounter a loss early on. In the high var pos EV machine they get 4 (small) losses in a row; in the low var negative EV machine they get a moderate loss in the first trial.
machine_game_data_clean %>%
  group_by(Sub_id, facet_labels) %>%
  slice(1:10) %>%
  summarise(num_explored = sum(ifelse(Response == "play", 1, 0))) %>%
  do(assign.age.info(.)) %>%
  ungroup() %>%
  group_by(age_group, facet_labels) %>%
  summarise(mean_num_explored = mean(num_explored/10*100),
            sem_num_explored = sem(num_explored/10*100)) %>%
  ggplot(aes(facet_labels, mean_num_explored, fill = age_group))+
  geom_bar(stat="identity", position = position_dodge(0.9))+
  geom_errorbar(aes(ymax = mean_num_explored+sem_num_explored, ymin = mean_num_explored-sem_num_explored), position = position_dodge(width = 0.9), width=0.25)+
  theme(legend.title = element_blank())+
  ylab("Percentage of exploration")+
  xlab("")
How does performance change depending on the delay since the last time a machine was played?
Can we think of this as a ‘memory effect’? The more trials since the last time you have played a machine, the more forgetting/interference?
For positive EV machines this is true for all groups. This is evident in the decreasing probability of a correct response the longer it has been since the last time a machine was played.
For negative EV machines adults and teens continue to make correct choices even if a lot of trials have passed since they last played that machine. Kids don’t seem to remember that the machine is ‘bad’ and are more likely to make an incorrect choice (and play the machine) the longer it’s been since they last played it.
machine_game_data_clean %>%
  group_by(Sub_id) %>%
  mutate(played_trial_number = ifelse(Response == "play", Trial_number, NA)) %>%
  mutate(played_trial_number = na.locf(played_trial_number, na.rm=F)) %>%
  filter(Trial_number > 1) %>%
  mutate(trials_since_last_played = Trial_number - lag(played_trial_number)) %>%
  ggplot(aes(trials_since_last_played, correct1_incorrect0, col = age_group))+
  geom_line(stat='smooth', method = 'glm', method.args = list(family = "binomial"), alpha=1, size=2)+
  facet_wrap(~facet_labels)+
  theme(legend.title = element_blank())+
  xlab("Trials since last played")+
  ylab("Correct")+
  scale_y_continuous(breaks=c(0,1))
If subjects are sensitive to losses and are learning something about the machines in a way that overweights their most recent experience with the machine, one sanity check is to compare how many trials it takes subjects to play a machine again after a loss versus after a gain. Presumably the former would be higher than the latter: one might hesitate to play a machine again after a loss but be more likely to play it soon after a gain.
count.postoutcome.trials <- function(subject_data){
  # losses and gains can only occur on played trials, so these all index play trials
  loss_trials = which(subject_data$Points_earned < 0)
  gain_trials = which(subject_data$Points_earned > 0)
  play_trials = which(subject_data$Response == "play")
  # the next play trial following each loss / gain (NA if the machine is never played again)
  post_loss_trials = play_trials[which(play_trials %in% loss_trials)+1]
  post_gain_trials = play_trials[which(play_trials %in% gain_trials)+1]
  num_trials_post_loss = post_loss_trials - loss_trials
  num_trials_post_gain = post_gain_trials - gain_trials
  # pad the shorter vector with NAs so both fit in one data frame
  if(length(num_trials_post_gain) > length(num_trials_post_loss)){
    num_trials_post_loss <- c(num_trials_post_loss, rep(NA, length(num_trials_post_gain) - length(num_trials_post_loss)))
  }
  else if(length(num_trials_post_gain) < length(num_trials_post_loss)){
    num_trials_post_gain <- c(num_trials_post_gain, rep(NA, length(num_trials_post_loss) - length(num_trials_post_gain)))
  }
  return(data.frame(num_trials_post_loss = num_trials_post_loss, num_trials_post_gain = num_trials_post_gain))
}
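A standalone toy example of the counting logic above (the trial sequence is hypothetical):

```r
# Hypothetical 6-trial sequence for one machine; 0 points on passed trials.
points   <- c(100,    0,     -10,    0,      0,      100)
response <- c("play", "pass", "play", "pass", "pass", "play")
play_trials <- which(response == "play") # 1 3 6
loss_trials <- which(points < 0)         # 3
gain_trials <- which(points > 0)         # 1 6
# gap (in trials) to the next play after each loss / gain:
play_trials[which(play_trials %in% loss_trials) + 1] - loss_trials # 3
play_trials[which(play_trials %in% gain_trials) + 1] - gain_trials # 2 NA (never played after trial 6)
```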
The plot below shows the average number of trials it takes a subject to play a given machine after experiencing a loss or a gain.
For everyone and for every machine, the average number of trials it takes a subject to play following a loss is higher than the average number of trials it takes them to play following a gain. This suggests that subjects are responding to outcomes in a way that overweights their most recent experience with the machine.
tmp = machine_game_data_clean %>%
  group_by(Sub_id, facet_labels) %>%
  do(count.postoutcome.trials(.)) %>%
  do(assign.age.info(.)) %>%
  ungroup() %>%
  select(facet_labels, age_group, num_trials_post_loss, num_trials_post_gain, Sub_id) %>%
  gather(key, value, -facet_labels, -age_group, -Sub_id) %>%
  mutate(key = gsub("num_trials_post_", "", key))
tmp %>%
  group_by(facet_labels, age_group, key) %>%
  summarise(mean_post = mean(value, na.rm=T),
            sem_post = sem(value)) %>%
  ggplot(aes(age_group, mean_post, shape=key, col=age_group))+
  geom_point(size=2)+
  geom_errorbar(aes(ymin = mean_post-sem_post, ymax = mean_post+sem_post), width=0)+
  facet_wrap(~facet_labels)+
  ylab("Number of trials until next play")+
  xlab("")+
  theme(legend.title = element_blank())+
  guides(color=FALSE)
Reflecting the global pattern in the proportion of plays in each condition, adults take longer than kids to return to a machine after large losses in the high variance negative EV condition, while kids are less sensitive to the magnitude of the loss.
summary(lm(value~age_group*facet_labels,tmp %>%filter(key=="loss")))
Call:
lm(formula = value ~ age_group * facet_labels, data = tmp %>%
filter(key == "loss"))
Residuals:
Min 1Q Median 3Q Max
-2.2500 -0.5442 -0.3193 -0.2433 26.8095
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.41387 0.07401 19.104 < 2e-16
age_groupteen -0.14114 0.10610 -1.330 0.18353
age_groupadult -0.17055 0.11157 -1.529 0.12643
facet_labels-5,+495 0.24867 0.09753 2.550 0.01082
facet_labels+10,-100 0.77661 0.11977 6.484 1.00e-10
facet_labels+5,-495 0.56288 0.18919 2.975 0.00295
age_groupteen:facet_labels-5,+495 0.02282 0.14045 0.162 0.87096
age_groupadult:facet_labels-5,+495 -0.17269 0.14511 -1.190 0.23410
age_groupteen:facet_labels+10,-100 0.38679 0.17779 2.175 0.02965
age_groupadult:facet_labels+10,-100 0.94702 0.20703 4.574 4.93e-06
age_groupteen:facet_labels+5,-495 0.33582 0.28074 1.196 0.23169
age_groupadult:facet_labels+5,-495 1.44381 0.32854 4.395 1.14e-05
(Intercept) ***
age_groupteen
age_groupadult
facet_labels-5,+495 *
facet_labels+10,-100 ***
facet_labels+5,-495 **
age_groupteen:facet_labels-5,+495
age_groupadult:facet_labels-5,+495
age_groupteen:facet_labels+10,-100 *
age_groupadult:facet_labels+10,-100 ***
age_groupteen:facet_labels+5,-495
age_groupadult:facet_labels+5,-495 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.615 on 3931 degrees of freedom
(1653 observations deleted due to missingness)
Multiple R-squared: 0.07136, Adjusted R-squared: 0.06876
F-statistic: 27.46 on 11 and 3931 DF, p-value: < 2.2e-16
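To unpack the interaction, the model-implied mean wait after a loss on the +5,-495 machine can be assembled from the coefficient table above (values copied from the summary; dummy coding with kid and -10,+100 as the baselines):

```r
# Model-implied mean trials-to-next-play after a loss, +5,-495 machine
# (coefficient values copied from the lm summary above)
b0        <- 1.41387   # intercept: kid, -10,+100
b_adult   <- -0.17055  # age_groupadult
b_machine <- 0.56288   # facet_labels+5,-495
b_inter   <- 1.44381   # age_groupadult:facet_labels+5,-495
kid_wait   <- b0 + b_machine                       # ~1.98 trials
adult_wait <- b0 + b_adult + b_machine + b_inter   # ~3.25 trials
round(c(kid = kid_wait, adult = adult_wait), 2)
```

The large positive interaction term is what drives the adult-kid gap in this condition.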
What affects whether a subject plays after experiencing a loss? The magnitude of the loss they experienced? How long they think before playing?
First look at how this changes over the course of the task: subjects are more likely to make the correct choice for the negative EV machines as the task progresses.
machine_game_data_clean %>%
mutate(losstrial = ifelse(Points_earned<0,1,0),
# note: lag() runs over the ungrouped rows here, so the first trial of each
# subject/machine block inherits the previous row's outcome
postloss = lag(losstrial),
postloss_play1_pass0 = ifelse(postloss == 1 & Response == "play",1, ifelse(postloss==1 & Response == "pass", 0, NA))) %>%
group_by(Sub_id, facet_labels) %>%
mutate(rel_trial = 1:n()) %>%
# ggplot(aes(rel_trial, postloss_play1_pass0))+
ggplot(aes(rel_trial, correct1_incorrect0))+
geom_smooth(aes(col=age_group), method='glm', method.args = list(family = "binomial"))+
facet_wrap(~facet_labels)+
scale_y_continuous(breaks=c(0,1))+
theme(legend.title = element_blank())+
xlab("Trial number")+
ylab("Probability of correct choice")
tmp = machine_game_data_clean %>%
mutate(losstrial = ifelse(Points_earned<0,1,0),
postloss = lag(losstrial),
postloss_play1_pass0 = ifelse(postloss == 1 & Response == "play",1, ifelse(postloss==1 & Response == "pass", 0, NA)),
lastlossamt = lag(Points_earned)) %>%
filter(postloss==1)
tmp %>%
ggplot(aes(Reaction_time, correct1_incorrect0))+
geom_smooth(aes(col=age_group), method='glm', method.args = list(family = "binomial"))+
facet_wrap(~facet_labels)+
scale_y_continuous(breaks=c(0,1))+
theme(legend.title = element_blank())+
xlab("RT")+
ylab("Probability of correct following a loss")
Baseline is -10,+100. Less likely to be correct in any of the machines after a loss compared to this baseline. Adults are more likely to be correct following a loss for all machines. There is also an effect of response time. The longer a decision takes the less likely it is to be correct. This is even stronger for adults (they are usually faster than kids but when they do take long they are even less likely to be correct).
If slower decisions are more likely to be incorrect, would this suggest less of a drift process and more interference or uncertainty about one's knowledge of that machine?
summary(glmer(correct1_incorrect0 ~ facet_labels+scale(Reaction_time)*age_group+(1|Sub_id), tmp, family="binomial"))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
control$checkConv, : Model failed to converge with max|grad| = 0.00238101
(tol = 0.001, component 1)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula:
correct1_incorrect0 ~ facet_labels + scale(Reaction_time) * age_group +
(1 | Sub_id)
Data: tmp
AIC BIC logLik deviance df.resid
4298.2 4361.3 -2139.1 4278.2 4056
Scaled residuals:
Min 1Q Median 3Q Max
-5.4769 -0.7040 0.3079 0.6116 4.9598
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 0.6881 0.8295
Number of obs: 4066, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.47694 0.18462 8.000 1.25e-15
facet_labels-5,+495 -0.83666 0.11828 -7.073 1.51e-12
facet_labels+10,-100 -1.49635 0.11631 -12.865 < 2e-16
facet_labels+5,-495 -2.47730 0.12226 -20.262 < 2e-16
scale(Reaction_time) -0.29891 0.05874 -5.089 3.60e-07
age_groupteen 0.33622 0.24402 1.378 0.168244
age_groupadult 1.02993 0.26372 3.905 9.41e-05
scale(Reaction_time):age_groupteen 0.01876 0.09302 0.202 0.840152
scale(Reaction_time):age_groupadult -0.36051 0.10426 -3.458 0.000545
(Intercept) ***
facet_labels-5,+495 ***
facet_labels+10,-100 ***
facet_labels+5,-495 ***
scale(Reaction_time) ***
age_groupteen
age_groupadult ***
scale(Reaction_time):age_groupteen
scale(Reaction_time):age_groupadult ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) f_-5,+ f_+10, f_+5,- sc(R_) ag_grpt ag_grpd
fct_-5,+495 -0.382
fc_+10,-100 -0.381 0.616
fct_+5,-495 -0.366 0.599 0.621
scl(Rctn_t) -0.094 0.070 0.024 0.062
age_grouptn -0.603 -0.007 -0.022 -0.027 0.051
age_gropdlt -0.552 -0.012 -0.027 -0.050 0.046 0.430
scl(Rctn_tm):g_grpt 0.065 -0.048 -0.023 -0.063 -0.633 -0.038 -0.028
scl(Rctn_tm):g_grpd 0.049 -0.021 -0.033 -0.007 -0.559 -0.029 -0.060
scl(Rctn_tm):g_grpt
fct_-5,+495
fc_+10,-100
fct_+5,-495
scl(Rctn_t)
age_grouptn
age_gropdlt
scl(Rctn_tm):g_grpt
scl(Rctn_tm):g_grpd 0.353
convergence code: 0
Model failed to converge with max|grad| = 0.00238101 (tol = 0.001, component 1)
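On the response scale the fixed effects above translate into post-loss accuracy as follows (coefficient values copied from the glmer summary; RT held at its mean, so the scaled RT terms drop out):

```r
# Predicted probability of a correct choice after a loss at mean RT
# (coefficient values copied from the glmer summary above)
plogis(1.47694)             # kid, -10,+100 machine:   ~0.81
plogis(1.47694 + 1.02993)   # adult, -10,+100 machine: ~0.92
plogis(1.47694 - 2.47730)   # kid, +5,-495 machine:    ~0.27
```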
If one knew which machines had positive and which negative EV, one would always play the positive EV machines and never play the negative EV ones, regardless of the observed outcome. So for the positive EV machines there would be no difference between gains and losses, and for the negative EV machines there would be no points to plot (because they would never be played). The difference in behavior depending on the valence of the recently observed outcome (gain/loss) could be due to at least two reasons: memory or loss aversion. Or perhaps adults have stronger memories for losses. Do kids play the bad machine because they can’t remember how bad it is, or because they don’t mind losing as much? Perhaps there is something interesting to look at in the hippocampal activity following losses versus gains.
Studies that compute loss aversion present subjects with gambles where the amounts and probabilities are known. This is not the case in our paradigm (which is what makes it a learning task), which is why I also estimate these parameters as part of the RL models later. For the sake of argument, let’s assume subjects knew the gain and loss amounts for each machine and calculate loss aversion:
We don’t find a difference in the estimates between adults and kids. Neither did Barkley-Levenson et al. (2014).
get_loss_aversion = function(data){
data = data %>%
filter(Response != "time-out") %>%
mutate(play1_pass0 = ifelse(Response=="pass", 0,1),
gain_mag = as.numeric(gain_mag),
loss_mag = as.numeric(loss_mag))
# logistic regression of play decisions on gain and loss magnitudes;
# loss aversion is the ratio of the loss weight to the gain weight
m = glm(play1_pass0 ~ gain_mag+loss_mag, data, family="binomial")
loss_ave = -coef(m)[3]/coef(m)[2]
return(data.frame(loss_ave = loss_ave))
}
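As a sanity check that the ratio -coef(m)[3]/coef(m)[2] recovers the loss aversion parameter, here is a quick simulation on a hypothetical gamble grid (not study data) where choices follow a logistic rule with a true lambda of 2.5:

```r
# Simulated check of the loss aversion ratio (true lambda = 2.5,
# i.e. the loss weight is 2.5 times the gain weight)
grid <- expand.grid(gain_mag = c(10, 20, 30, 40),
                    loss_mag = c(5, 10, 15, 20))
# Choice probabilities from a logistic rule: utility = 0.2*gain - 0.5*loss
grid$p_play <- plogis(-0.5 + 0.2 * grid$gain_mag - 0.5 * grid$loss_mag)
# Fitting on the exact choice probabilities (weights stand in for repeated
# trials) recovers the generating coefficients
m <- suppressWarnings(glm(p_play ~ gain_mag + loss_mag, data = grid,
                          weights = rep(100, nrow(grid)), family = "binomial"))
lambda_hat <- -coef(m)["loss_mag"] / coef(m)["gain_mag"]
round(unname(lambda_hat), 2)  # 2.5
```

With real binary choices the estimate is of course noisier, which is one reason the RL-model estimates later may be preferable.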
machine_game_data_clean %>%
group_by(Sub_id) %>%
do(get_loss_aversion(.)) %>%
do(assign.age.info(.)) %>%
ggplot(aes(log(loss_ave), fill=age_group))+
geom_density(alpha=0.4, color=NA)+
theme(legend.title = element_blank())
Are subjects less likely to play overall after a loss or only less likely to play that machine after a loss for that machine?
mean.postloss.play.prob <- function(subject_data){
# probability of playing on the trial immediately following a loss
loss_trials = which(subject_data$Points_earned<0)
mean_post_loss_prob <- mean(ifelse(subject_data$Response[loss_trials+1] == "play", 1, 0), na.rm=T)
return(data.frame(mean_post_loss_prob=mean_post_loss_prob))
}
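A quick check on a hypothetical five-trial sequence (toy data; the definition is restated so the snippet runs on its own):

```r
# Restated from above so this snippet is self-contained
mean.postloss.play.prob <- function(subject_data){
loss_trials = which(subject_data$Points_earned<0)
mean_post_loss_prob <- mean(ifelse(subject_data$Response[loss_trials+1] == "play", 1, 0), na.rm=T)
return(data.frame(mean_post_loss_prob=mean_post_loss_prob))
}
# Losses on trials 1 and 4; trial 2 is a pass, trial 5 a play
toy <- data.frame(Points_earned = c(-10, 0, 100, -10, 100),
                  Response = c("play", "pass", "play", "play", "play"))
mean.postloss.play.prob(toy)  # one of the two post-loss trials is a play
```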
The probability of playing following a loss depends on machine type; looking at all trials masks this difference. Subjects seem to learn about each machine separately, and cross-talk between machines isn’t evident here.
tmp = machine_game_data_clean %>%
group_by(Sub_id) %>%
do(mean.postloss.play.prob(.)) %>%
mutate(facet_labels = "all_trials")
machine_game_data_clean %>%
group_by(Sub_id, facet_labels) %>%
do(mean.postloss.play.prob(.)) %>%
rbind(tmp) %>%
do(assign.age.info(.)) %>%
group_by(age_group, facet_labels) %>%
summarise(mp = mean(mean_post_loss_prob,na.rm=T),
sp = sem(mean_post_loss_prob)) %>%
ggplot(aes(facet_labels, mp, fill=age_group))+
geom_bar(stat="identity",position=position_dodge())+
geom_errorbar(width=0, aes(ymin = mp-sp, ymax = mp+sp), position = position_dodge(width=0.9))+
xlab("")+
ylab("Post loss play probability")+
theme(legend.title = element_blank())
machine_game_data_clean %>%
ggplot(aes(log(Reaction_time))) +
geom_density(aes(fill = age_group), alpha=0.5, color=NA) +
facet_wrap(~facet_labels)+
theme(legend.title = element_blank())+
ylab("")+
xlab("Log Response Time")
machine_game_data_clean %>%
group_by(Sub_id, facet_labels) %>%
summarise(mean_log_rt = mean(log(Reaction_time)),
sem_log_rt = sem(log(Reaction_time))) %>%
do(assign.age.info(.)) %>%
ggplot(aes(age_group, mean_log_rt))+
geom_boxplot(aes(fill=age_group))+
facet_wrap(~facet_labels)+
theme(legend.position = "none")+
ylab("Mean Log Rt")+
xlab("Age group")
Both teens and adults are faster than kids in all conditions except the high variance negative EV condition (+5,-495), where only adults are reliably faster than kids.
#summary(lmer(log(Reaction_time) ~ age_group*facet_labels +(1|Sub_id), data = machine_game_data_clean))
summary(lmer(log(Reaction_time) ~ age_group +(1|Sub_id), data = machine_game_data_clean%>%filter(facet_labels == "-10,+100")))
Linear mixed model fit by REML ['lmerMod']
Formula: log(Reaction_time) ~ age_group + (1 | Sub_id)
Data: machine_game_data_clean %>% filter(facet_labels == "-10,+100")
REML criterion at convergence: 3451.5
Scaled residuals:
Min 1Q Median 3Q Max
-3.5477 -0.6497 -0.1096 0.5750 3.9205
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 0.03539 0.1881
Residual 0.15641 0.3955
Number of obs: 3320, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error t value
(Intercept) 6.99902 0.03663 191.091
age_groupteen -0.19628 0.05382 -3.647
age_groupadult -0.24963 0.05732 -4.355
Correlation of Fixed Effects:
(Intr) ag_grpt
age_grouptn -0.681
age_gropdlt -0.639 0.435
summary(lmer(log(Reaction_time) ~ age_group +(1|Sub_id), data = machine_game_data_clean%>%filter(facet_labels == "-5,+495")))
Linear mixed model fit by REML ['lmerMod']
Formula: log(Reaction_time) ~ age_group + (1 | Sub_id)
Data: machine_game_data_clean %>% filter(facet_labels == "-5,+495")
REML criterion at convergence: 4006.2
Scaled residuals:
Min 1Q Median 3Q Max
-6.5685 -0.6518 -0.1165 0.6273 3.1221
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 0.04673 0.2162
Residual 0.18454 0.4296
Number of obs: 3319, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error t value
(Intercept) 6.99856 0.04188 167.096
age_groupteen -0.11935 0.06154 -1.939
age_groupadult -0.16069 0.06554 -2.452
Correlation of Fixed Effects:
(Intr) ag_grpt
age_grouptn -0.681
age_gropdlt -0.639 0.435
summary(lmer(log(Reaction_time) ~ age_group +(1|Sub_id), data = machine_game_data_clean%>%filter(facet_labels == "+10,-100")))
Linear mixed model fit by REML ['lmerMod']
Formula: log(Reaction_time) ~ age_group + (1 | Sub_id)
Data: machine_game_data_clean %>% filter(facet_labels == "+10,-100")
REML criterion at convergence: 3737.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.2684 -0.6826 -0.1054 0.6512 3.1234
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 0.02936 0.1713
Residual 0.17125 0.4138
Number of obs: 3323, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error t value
(Intercept) 7.03441 0.03383 207.948
age_groupteen -0.13692 0.04971 -2.754
age_groupadult -0.12265 0.05294 -2.317
Correlation of Fixed Effects:
(Intr) ag_grpt
age_grouptn -0.681
age_gropdlt -0.639 0.435
summary(lmer(log(Reaction_time) ~ age_group +(1|Sub_id), data = machine_game_data_clean%>%filter(facet_labels == "+5,-495")))
Linear mixed model fit by REML ['lmerMod']
Formula: log(Reaction_time) ~ age_group + (1 | Sub_id)
Data: machine_game_data_clean %>% filter(facet_labels == "+5,-495")
REML criterion at convergence: 3814.1
Scaled residuals:
Min 1Q Median 3Q Max
-6.0272 -0.6828 -0.0927 0.6238 3.8977
Random effects:
Groups Name Variance Std.Dev.
Sub_id (Intercept) 0.03376 0.1837
Residual 0.17454 0.4178
Number of obs: 3328, groups: Sub_id, 74
Fixed effects:
Estimate Std. Error t value
(Intercept) 6.99134 0.03603 194.057
age_groupteen -0.06270 0.05295 -1.184
age_groupadult -0.13093 0.05639 -2.322
Correlation of Fixed Effects:
(Intr) ag_grpt
age_grouptn -0.680
age_gropdlt -0.639 0.435
How would one group learners vs. non-learners? Perhaps as those who are more likely to make correct choices later in the task, i.e. a positive slope for the logistic fit of accuracy on trial number?
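The helper get_learning_coef used below is defined elsewhere in the notebook; judging from its use, it returns the slope b1 of a logistic fit of accuracy on trial number. A hypothetical reconstruction (the function body and the toy data are assumptions, not the notebook's actual definition):

```r
# Hypothetical sketch: slope of a logistic fit of accuracy on trial number.
# A positive b1 means accuracy improves over the task, i.e. a "learner".
get_learning_coef <- function(subject_data){
  subject_data$rel_trial <- seq_len(nrow(subject_data))
  m <- glm(correct1_incorrect0 ~ rel_trial, data = subject_data,
           family = "binomial")
  data.frame(b1 = coef(m)["rel_trial"])
}
# Toy check: mostly incorrect early, mostly correct late -> positive slope
toy <- data.frame(correct1_incorrect0 = c(0, 1, 0, 0, 1, 0, 1, 1, 1, 1))
get_learning_coef(toy)$b1 > 0  # TRUE
```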
tmp = machine_game_data_clean %>%
group_by(Sub_id, facet_labels) %>%
do(get_learning_coef(.)) %>%
do(assign.age.info(.)) %>%
mutate(learner = ifelse(b1>0,1,0))
with(tmp, table(learner, facet_labels, age_group))
, , age_group = kid
facet_labels
learner -10,+100 -5,+495 +10,-100 +5,-495
0 14 14 12 18
1 15 15 17 11
, , age_group = teen
facet_labels
learner -10,+100 -5,+495 +10,-100 +5,-495
0 12 10 7 10
1 13 15 18 15
, , age_group = adult
facet_labels
learner -10,+100 -5,+495 +10,-100 +5,-495
0 11 5 2 3
1 9 15 18 17
non_learners = tmp %>%
filter(facet_labels %in% c("+5,-495", "+10,-100")) %>%
filter(learner == 0)
non_learners = unique(non_learners$Sub_id)
non_learners
[1] 100003 100009 100042 100051 100057 100059 100063 100068 100105 100110
[11] 100129 100143 100169 100180 100185 100188 100207 100241 100243 100244
[21] 100250 200056 200085 200133 200162 200164 200168 200199 200211 306587
[31] 311047 311283 311444 311479 400742 407260 408394 411477
learner_info = data.frame(Sub_id = unique(machine_game_data_clean$Sub_id))
learner_info = learner_info %>%
mutate(learner = ifelse(!(Sub_id %in% non_learners), 1, 0),
non_learner = ifelse(Sub_id %in% non_learners, 1, 0),
Sub_id = paste0('sub-', Sub_id))
write.csv(learner_info, '/Users/zeynepenkavi/Dropbox/PoldrackLab/DevStudy_ServerScripts/nistats/level_3/learner_info.csv', row.names = FALSE)
Or group based on trials post-learning? [probably for imaging]
Details of model comparison can be found in a separate notebook.